Quantifying evolvability in small biological networks 
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We introduce a quantitative measure of the capacity of a small biological network to evolve. We 
apply our measure to a stochastic description of the experimental setup of Guet et al. (Science 
296:1466, 2002), treating chemical inducers as functional inputs to biochemical networks and the 
expression of a reporter gene as the functional output. We take an information-theoretic approach, 
allowing the system to set parameters that optimize signal processing ability, thus enumerating 
each network's highest-fidelity functions. We find that all networks studied are highly evolvable 
by our measure, meaning that change in function has little dependence on change in parameters. 
Moreover, we find that each network's functions are connected by paths in the parameter space 
along which information is not significantly lowered, meaning a network may continuously change 
its functionality without losing it along the way. This property further underscores the evolvability 
of the networks. 



Many signals in cells are processed using a network of 
interacting genes: exogenous signals affect expression of 
genes coding for transcription factor proteins, which in 
turn regulate the expression of other genes. Although 
early works have suggested that the connectivity of such 
regulatory networks dictates their function [TJ [21 12], re- 
cent studies offer evidence that a network with fixed con- 
nectivity can change its function simply by varying its 
biochemical parameters |l|51ll|7j. The diversity of a 
network's achievable functions and the ease with which 
it can realize them are central to its capacity to evolve 
epigenetically, without slow and costly modifications to 
the genetic code, and thus central to the evolutionary 
capacity of the organism as a whole. 

The evolvability of a regulatory network has been a 
topic of much discussion in recent literature [9j [10J 
[TTJ [12], but little has been done to quantify the con- 
cept in a principled way. Here we propose a quantitative 
measure of network evolvability, and we apply it to a 
set of small regulatory networks, such that a principled 
comparison can be made across networks. Networks are 
taken from the experimental setup of Guet et al. [4] and 
modeled stochastically. We find biochemical parameters 
that optimize the information flow between a chemical 
"input" signal and a particular "output" gene, and we 
indeed find that a single network performs different func- 
tions at different sets of optimal parameters. We argue 
that a more evolvable network will be able to access a 
richer diversity of its functions with smaller changes in 



its parameters, and as such we quantify evolvability us- 
ing a measure of anti-correlation between parametric and 
functional change. 

We find that while there are small differences among 
networks' evolvability scores, all are highly evolvable, 
meaning that the magnitude of a functional change has 
little dependence on the parametric change required to 
produce it. Moreover, we find that transitions among a 
network's optimally informative functions can be made 
without significant loss of the input-output information 
along the way. By proposing and demonstrating a princi- 
pled evolvability measure, we reveal these features quan- 
titatively; both features suggest a high capacity of the 
studied regulatory networks to evolve. 



METHODS 

First we briefly outline the methods used to develop 
a quantitative measure of evolvability; each step is dis- 
cussed in more detail in the sections that follow. The sys- 
tem of interest is a small (4-gene) transcriptional regula- 
tory network. As in Guet et al. [I], we treat the presence 
or absence of chemical inducers (small molecules that af- 
fect the efficacy of the transcription factors) as the inputs 
to the network, and the expression of a particular gene as 
the output. We use a stochastic model to find expression 
level distributions at a steady-state, and we search for the 
model parameters which maximize the mutual informa- 



2 



tion between input and output. We characterize the func- 
tion that the network performs by the order in which the 
output distributions corresponding to each input state 
are encoded [13) ; a single network can perform different 
functions at different parameter settings. We define the 
evolvability of the network as the ability to perform a 
diverse set of functions with only small changes in pa- 
rameters, and we quantify evolvability accordingly using 
a measure of anti-correlation between pair-wise function 
distance and parameter distance. 



Model 

Following the experimental setup of Guet et al. [I] , we 
study all networks that can be built out of three genes 
A, B, and C, in which each gene is regulated by one 
other gene, and regulation edges can be up-regulating 
or down-regulating. Additionally, as in the experiment, 
gene C down-regulates a "reporter" gene G (e.g., GFP), 
whose expression we treat as the functional output of the 
network. This yields a total of 24 networks, as shown on 
the horizontal axis of Figure [2] Also in analogy to the 
experiment, the efficacy of each transcription factor can 
be inhibited by the presence of a chemical inducer s, a 
small molecule that binds to the transcription factor and 
lowers its affinity for its binding site. The presence or 
absence of the chemical inducers sa, sb> and sc corre- 
sponding to each transcription factor A, B, and C define 
the functional input state of the network. The inhibitory 
effect of each inducer is illustrated for an example net- 
work in the top panel of Figure[T]A and the eight possible 
input states i, determined by the presence or absence of 
the inducers, are listed in the bottom panel. 

For a typical regulatory network inside a cell, intrin- 
sic noise arising from fluctuations in the small numbers of 
species |14l 115] is the primary factor limiting transmission 
of information from a chemical input to a genetic out- 
put [FJ EH HH [T7]. This observation has two important 
consequences for modeling: (a) a realistic model should 
capture not just mean protein concentrations, but proba- 
bility distributions over numbers of proteins [T51 [H5] , and 
(b) the most biologically relevant model parameters will 
retain an optimal flow of information in the presence of 
this noise [3 HH [21] . 

The first consequence is most fully addressed by solv- 
ing the chemical master equation [22], which describes 
the time-evolution of the joint probability distribution 
for the numbers of all molecules in the system given the 
elementary reactions (which are ultimately a function of 
the network topology). For our systems, the master equa- 
tion is not analytically solvable. Progress can be made 
either by Monte-Carlo simulation of the master equation 
[2"3"] , or by approximating the master equation, e.g. with 
the linear-noise approximation (LNA) [51 IT31 P22l I2H 123] . 
Since it does not rely on sampling, the LNA is much more 



computationally efficient (and thus more amenable to a 
search for high-fidelity model parameters) , and in previ- 
ous work [5j we found the distributions obtained via the 
LNA were practically indistinguishable from those ob- 
tained via stochastic simulation for copy numbers above 
10 - 20. 

In the LNA, the reaction rates in the master equation 
are linearized, and the steady-state solution is a multi- 
dimensional Gaussian distribution [5J [T3J H2] . The indi- 
vidual species' marginal distributions are thus described 
at the level of Gaussian fluctuations, with means given 
by the steady-state solution to the deterministic dynam- 
ical system describing mean protein numbers. Mean ex- 
pression has been modeled with remarkable success by 
combining transcription and translation into one step 
[2"o1 |2"T1 12"5] . and accordingly, for each of our networks, 
we use the following dynamical system in which species 
are directly coupled to one another, 



dt 



aj{X Vj /s^,) - RjXj, 



(1) 



where the Xj £ {A, B, C, G} are the expression levels (in 
units of proteins per cell) of the four genes, the Rj are 
degradation rates, and the aj are production rates which 
depend on the expression level X 1Tj and a scale factor s n . 
of the parent ttj of gene j (where the parent-children con- 
nectivity is determined by the network topology). The 
scale factors, s n . € {sa, sb, sc} > L incorporate the in- 
hibitory effect of each chemical inducer (when present) 
by reducing the effective transcription factor concentra- 
tion; when an inducer is absent, its scale factor is set to 
1. Regulation is modeled using Hill functions, 



aj (x) 



_ (ao + aj x , t +( K )n up - regulating, 



a o + a j x^+Ik^ down - regulating, 



(2) 



where ao (kept the same for all genes) is the basal pro- 
duction rate, ao + aj is the maximal production rate, Kj 
is the binding constant or the protein number at which 
production is half-maximal, and n is the cooperativity, 
which we set to 2. Note from Eqns. (l][2) that increas- 
ing the scale factor can be equivalcntly interpreted as 
increasing the effective binding constant or lowering the 
binding affinity. Steady states of Eqn. Q are found by 
solving (using MATLAB's roots) the polynomial equa- 
tions that result from setting the left-hand side to zero, 
and keeping only those solutions for which the Jacobian 
matrix J of Eqn. has eigenvalues whose real parts are 
all negative. 

The variances of the marginal distributions are the di- 
agonal entries in the covariance matrix S, which under 
the LNA satisfies a Lyapunov equation, 



JS + 3 J T 



D = 0, 



(3) 



where I? is a diagonal matrix with, for the system in Eqn. 
([Tj), the jth entry equal to aj(X- Kj j s^^ + RjXj. Eqn. 
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^ is solved using a standard Lyapunov solver (MAT- 
LAB 's lyap). Since the steady-state distributions are 
Gaussian under the LNA, the solution is fully specified 
by the means and the variances. The distributions of par- 
ticular functional importance are P(G\i), the probability 
of expressing G reporter proteins per cell given that the 
chemical inducers are in state i. 

To address the second consequence, that biologically 
relevant solutions often optimize information flow in the 
presence of intrinsic noise |20l 121] , as in previous work 
[5] we allow the system to set parameters that maximize 
the mutual information / [S3] between input state i and 
output expression G, where 



I = YlJ dGP(i,G)log 



P(i,G) 
2 P{i)P{G) 



!'! 



/ dGp(G\i)log 2 



i\P(G\i) 



(4) 
(5) 



Here I is measured in bits, and the second step uses 
P(i,G) = P(G\i)P(i), P(G) = Y^i>P(i',G), and an as- 
sertion that each input state occurs with equal likelihood 
(i.e., P(i) = where \i\ = 8 is the number of input 

states) to write / entirely in terms of the model solutions 
p {G\i). 

Two computationally trivial ways for the system to 
maximize / are (a) to use an unbounded number of re- 
porter proteins G to encode the signal, and (b) to set 
degradation rates such that G responds on a timescale 
much longer than that of the upstream genes (called a 
"stiff" system) , which has the effect of averaging out the 
upstream noise. In contrast, in cells, protein production 
requires energy, which sets a limit on the number of pro- 
teins that a cell can produce, and most protein degra- 
dation rates are comparable. Therefore we seek model 
parameters 8* that optimize a constrained objective func- 
tion, 

6* = argmax,- [I - X(Xj) - 1 {R 7Tj ) /R G ] , (6) 

where the constants A and 7 are a metabolic cost and a 
constraint against stiffness, respectively, the average (Xj) 
is taken over all genes, and the average (R Vj ) is taken over 
upstream genes A, B, and C. Optimization is performed 
using a simplex algorithm (MATLAB's f minsearch) in 
a 15-dimensional parameter space, as 

8 = {a ,aA,aB,ac,a G ,KA,K B ,Kc,KG, 

Ra,Rb,Rc,sa,sb,s c } (7) 

(Rg was fixed at 4 x 10 _4 s _1 to set a biologically realistic 
degradation rate scale). 

Varying initial parameters yields many local optima 8* 
at which the input signal may be encoded differently in 
the output distributions P(G\i). For example, two opti- 
mally informative solutions are shown in Figure IB for 



the network in Figure [T|\. Intuitively, maximizing mu- 
tual information has resulted in sets of distributions that 
are well separated, such that knowledge of the output 
G would leave little ambiguity about the original input 
state i. We point out, however, that the ordering of the 
output distributions is different between the two solu- 
tions, meaning that the network is performing two differ- 
ent functions at two different points in parameter space. 
The relationship between diversity of such functions and 
exploration of parameters is crucial to the discussion of 
evolvability; in the next section we develop a quantitative 
measure of evolvability in the context of this system. 



Quantifying evolvability 

As seen in several experimental and numerical stud- 
ies [H OJ El [7], and in data from the model described 
above, a single regulatory network can perform different 
functions simply by varying its biochemical parameters. 
Intuitively, a network should be deemed more evolvable 
if it is able to access a richer diversity of its functions 
with smaller changes in its parameters. Quantification of 
this concept requires definitions of both parametric and 
functional change. 

As in Barkai et al. |301 , we characterize the magnitude 
of the parametric change in going from one model solu- 
tion to another by calculating fold-changes in the model 
parameters. Specifically, we define a parameter distance 
A8 between two solutions as the Euclidean distance in 
the logs of the parameters, 



A0 = I log 2 61 - log 2 0* 2 



\ 



'" r 

D 

k=l 



log 2 



'iJOlk) > (8) 



where \k\ = 15 is the number of parameters. Under this 
definition, equal fold-changes in each parameter consti- 
tute equal contributions to A8 (for scale, the doubling of 
one parameter corresponds to A8 =1). 

As in previous work |13j and in the original exper- 
iment of Guet et al. [1], we define the function of a 
network analogously to logic in electrical circuits (AND, 
OR, XOR, etc.), in which the function is determined by 
the magnitude of the output's response to each input 
state (for example, with two inputs, AND would be de- 
fined by a "high" output in response to the [++] state, 

and a "low" output in response to the [ ], [ — h], and 

[H — ] states). Since, in our setup, optimizing information 
produces well-separated output distributions P(G\i) (see 
Figure^), we extend this idea beyond a simple "high" 
or "low" output classification, and characterize function 
by the order of the P(G\i). Specifically, we record a vec- 
tor r of ranks of the P(G\i)\ for example, in the top 
panel of Figure [TJ3, the first output distribution (i — 1) 
is ranked 4th, the second (i = 2) is ranked 7th, the third 
(i = 3) is ranked 1st, and so on, so the rank vector is 
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r = (4, 7, 1, ... ). We then define the function distance 
Af between two solutions in terms of the vector distance 
between their rank vectors, 

1 1 N 

A/=-|rl-r 2 | 2 = -5> M -r 2 ,,) 2 (9) 

»=i 

(for scale, the swapping of two adjacent output distribu- 
tions corresponds to A/ = 1) [3D]. Other function dis- 
tances, including other permutation distances between 
the rank vectors, and a continuous distance measure de- 
fined by averaging the Jensen-Shannon divergence |31j 
between corresponding output distributions in the solu- 
tion pair, produced similar results, as discussed in the 
Results section. 

It is now clear that, if a network is better able to ex- 
plore its function set with smaller changes in its parame- 
ters (i.e., is more evolvable by our definition), then it will 
exhibit less correlation between A/ and AO than other 
networks. Therefore we define an evolvability score E for 
a given network as a measure of anti-correlation between 
A/ and AO, calculated for every pair of its optimal model 
solutions (41]. Specifically, 

£=1-(t+1)/2, (10) 

where r is Kendall's tau [32 , a nonparametric measure of 
correlation between all pairwise Af and AO; we rescale r 
such that < E < 1 and take its complement to obtain 
an anti-correlation. Using a nonparametric correlation 
statistic has the advantage that our evolvability measure 
remains invariant upon any monotonic rescaling in the 
definitions of either A8 or Af. Additionally, we note 
that E can be thought of as the probability that a pair of 
solutions drawn at random have a larger Af than another 
pair given that the first pair had a smaller AO, or as the 
fraction of discordant pairs of {AO, Af) data points [42"] . 

Function distance Af vs. parameter distance AO for all 
pairs of model solutions is plotted in Figure [Tp for the 
example network in Figure [l]A.. The evolvability score 
calculated from these data is E = 0.482 which, since 
there is little correlation (or anti-correlation) between Af 
and AO in this case, is near the middle value E = 0.5. 

We obtain a fairer estimate of E and an estimate of its 
error by subsampling. Specifically, in the spirit of Strong 
et al. [33], we compute the mean E and standard error 
8E in E values calculated on randomly drawn subsets 
of a given size n (from the full data set of size N). We 
then repeat for various n, plot E ± SE vs. N/n, and 
fit with a line (all plots generated were roughly linear). 
The value and uncertainty of the N/n = intercept give 
an estimate of E, extrapolated to infinite data, and a 
measure of sampling error, respectively. The sampling 
error estimated in this way for the data in Figure [lp is 
0.001. 



RESULTS 
All networks studied are evolvable 

Using the methods described above, between 200 and 
500 optimally informative model solutions were obtained, 
and an evolvability score E was calculated for each of the 
24 networks shown on the horizontal axis of Fig. [2] The 
constraints were set to A = 0.01 or 0.005, for an average 
protein count of ~100 — 200, and 7 = 0.001, allowing a 
maximum of about 3 orders of magnitude between up- 
stream and reporter degradation rates. Solutions with 
mutual information values below 1 = 2 bits were dis- 
carded as not transmitting high enough information (for 
scale, a solution with perfectly overlapping output dis- 
tributions would have 1 = bits, and a solution with 8 
perfectly non-overlapping output states would have 1 = 3 
bits). 

Networks' evolvability scores are shown in Fig. [2] All 
24 networks had E values within 5% of 0.5 (recall that 
E is bounded by < E < 1), which means that, in all 
cases, there is little correlation between change in func- 
tion and change in parameters, suggesting that all net- 
works studied are evolvable. Using other function dis- 
tances, including other permutation distances between 
the rank vectors, and a continuous distance measure de- 
fined by averaging the Jensen-Shannon divergence [31] 
between corresponding output distributions in the solu- 
tion pair, produced similar results: E scores were very 
near 0.5, indicating little correlation between functional 
and parametric distances. 

The claim that function has little dependence on pa- 
rameters can be tested more rigorously by comparison 
with a null hypothesis. The null hypothesis that func- 
tion is independent of parameters was implemented in 
two ways. First, given each network's solution set, loca- 
tions of solutions in parameter space were kept the same, 
but the functions associated with each solution were ran- 
domly permuted. Second, locations of solutions in pa- 
rameter space were again kept the same, but functions 
were drawn randomly from the set of possible functions 
for each network [?3] . In each case, the function reassign- 
ment was performed many times, and the E value was 
computed each time to produce a distribution of null E 
scores. There was no correlation between the means or 
variances of the networks' null distributions and their ac- 
tual E scores, so the individual null distributions were av- 
eraged across networks. Averaged null distributions from 
each of the two implementations are qualitatively similar, 
and both are shown in Figure [2] All networks' actual E 
values lie well within both null distributions (the small- 
est p-value is 0.023, and, with 24 networks, we expect at 
least one to attain a p-value lower than 1/24 = 0.04 sim- 
ply by chance). This means that none of the networks' 
solution sets significantly differ from a set in which the 
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FIG. 1: Defining evolvability. A Top: a sample regulatory network (see Figure [5] for diagrams of all 24 networks studied). A, 
B, and C are genes whose transcription factors regulate each other's expression according to the given network topology, and G 
is a "reporter" gene, such as GFP. Sharp arrows indicate up-regulation, while blunt arrows indicate down-regulation (all arrows 
are blunt in this network), sa, sb, and sc are chemical inducers that reduce the efficacy of the corresponding transcription 
factors. Bottom: Table showing the 8 input states i that are defined by the presence or absence of each chemical inducer in 
the cell (+ indicates presence, — indicates absence). In the model, the sa, sb, and sc are scale factors that are free parameters 
(greater than 1, to effect an interference with transcriptional regulation) if the inducer is present, and are set to 1 if the inducer 
is absent. B: Two maximally informative functions performed by the sample network at two different parameter settings. 
Function is characterized by the order of the output distributions P(G\i), the probability of expressing G proteins per cell given 
that the system is in input state i. Specifically, the function is quantified by the vector f of ranks of the P(G\i), as shown for 
each function in the upper right corner. For example, in the top function, the first output distribution (i = I) is ranked 4th, 
the second (i — 2) is ranked 7th, the third (i = 3) is ranked 1st, and so on, so the rank vector is f = (4, 7, 1, . . . ). C: Plot 
of function distance A/ [Eqn. |9|] vs. parameter distance Af? [Eqn. |8|] for all pairs of maximally informative model solutions 
(340 solutions were used for this network). Function distance is sea ed such that the swapping of two adjacent output states 
from solution one to solution two gives A/ = 1, and parameter distance is scaled such that the doubling of one parameter from 
solution one to solution two gives A8 = 1. The point corresponding to the pair of functions in B is circled. The evolvability 
score for this network, calculated from these data via Eqn. (10 1, is E = 0.482 ± 0.001. 



function performed is independent of the setting of the 
parameters. 

Even though all E values lie within the null distribu- 
tion, only two lie above the null mean of 0.5; the prob- 
ability of this happening by chance is 2 x 1CP 5 . For a 
network with E much larger than 0.5, the parameter 
and the functional distances would be anti-corrclated, 
and the network function would evolve dramatically with 
very small parameter changes. Thus the vast majority of 
the networks studied show a statistically significant, yet 
unintuitive ly small, positive correlation among the func- 
tional and the parametric distance. 

Despite the fact that the E values lie in a narrow range, 
sampling errors are small (see Fig. [2]), meaning that the 
networks can be ranked with some confidence accord- 
ing to their evolvability. We asked statistically whether 
this ranking was correlated with any topological features 
of the network, including the sign of the regulation of 
each gene, the length and net sign of the feedback cy- 
cle, and the total number of activators and repressors 
in the network, both in and out of the cycle. Correla- 
tion was tested for features with categorical values using 
a Wilcoxon rank-sum test j34j [35] (for two categories) 
or a Kruskal-Wallis -ff-test [35] (for more than two cate- 
gories), and for features with real values using Kendall's r 
[52] . No topological feature significantly correlated with 



E. The lowest p- value was 0.04, and, since many correla- 
tions were tested for at once, a Bonferoni correction [3"T] 
showed that the likelihood of obtaining a p-value this low 
simply by chance was 0.33. Thus we identified no topo- 
logical aspect that significantly imparted higher or lower 
evolvability to the networks. 



Changing functions without losing functionality 

As described in the previous section, we have found 
that the networks studied organize their optimally infor- 
mative solutions in parameter space in such a way that 
change in function is largely independent of change in 
parameters. We further demonstrate here that the net- 
works can change from one function to another in param- 
eter space without significant loss of the input-output in- 
formation along the way. This further underscores the 
evolvability of these networks, since it shows that ran- 
dom steps in parameter space not only explore the full 
variety of a network's functions, but do so without sig- 
nificant loss of fidelity. In the context of electric logical 
circuits, such evolvability would correspond to an ability 
to continuously change a logic gate from performing one 
logical function to another while remaining a functional 
gate in the interim. 
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FIG. 2: Left: Evolvability scores E for all 24 regulatory net- 
works studied. Networks are shown along the horizontal axis, 
ranked by E (sharp arrows denote up-regulation, and blunt 
arrows denote down-regulation). E values are calculated via 
Eqn. 1 10 1, with error bars showing the sampling error, calcu- 
lated as described in the text. Right: Two null distributions 
generated according to the null hypothesis that the function 
distance is independent of the parameter distance. The solid 
line is the distribution of E scores calculated from solution 
sets in which the locations in parameter space were held fixed, 
and the function assignments were randomly permuted. The 
dotted line is the distribution of E scores calculated from so- 
lution sets in which the locations in parameter space were 
held fixed, and the function assignments were drawn ran- 
domly from the set of possible functions for the given network. 
Both distributions are averages over the individual distribu- 
tions for each network, as there was no correlation between 
the means or variances of the individual distributions and the 
networks' E scores. 



For each network, mutual information / [Eqn. ([5])] was 
calculated along straight-line paths in parameter space 
between all solutions pairs within a randomly chosen sub- 
set of its optimally informative solutions. Examples of 
these paths are shown in Figure [3j\, for 10 solutions from 
the inset network. The solutions at either end are local 
maxima in /, and the paths show the loss in information 
capacity the network would suffer if it were to move from 
one solution to the other along a straight line in parame- 
ter space. Some information loss is unavoidable: chang- 
ing function requires reordering the output distributions 
(see Figure [Tj3), which means overlapping at least two 
of them at a time, and with 8 distributions the shift of 
two distributions from fully separated to fully overlapped 
incurs a minimum loss of at 0.25 bits. Seven of the 10 
functions corresponding to the 10 solutions in Figure [3]A_ 
are unique; at least 91% of the plotted paths involve a 
change in function. 

Nonetheless, we find that the loss in information suf- 
fered in going between optimal solutions is surprisingly 
minimal. The right panel of Figure [3j\ shows the dis- 
tribution of minimal mutual information values Iq along 



the paths for the inset network, and Figure [3j3 shows the 
means and the standard deviations of Iq distributions 
for all networks. For only a few networks do a significant 
portion of the paths drop below 1.5 bits, and almost no 
paths drop below 1 bit. We note in passing that the net- 
works in Figure [3]3 are shown as in Figure [2] i.e. ranked 
by evolvability score E, and so Figure [3j3 also demon- 
strates that there is no significant correlation between Iq 
and E. 

We emphasize that Figure [3j3 represents a lower bound 
on minimum mutual information encountered in transi- 
tioning between solutions. It is by no means necessary 
(and is most likely biologically unrealistic) for a func- 
tional change to proceed via such uniform changes in 
biochemical parameters. It is more likely that there exist 
transition paths that are more optimal than the straight- 
line paths, and that the most optimal Iq distributions are 
actually shifted higher in information than those gener- 
ated here. Thus it is quite nontrivial (and it is further 
testament to their evolvability) that even along direct 
paths between optimal solutions these networks in most 
cases do not drop below 1.5 bits of processing ability, 
considering that the solutions themselves operate in the 
range of ~2 — 2.8 bits. A network can be evolving and 
functional at the same time. 



DISCUSSION 

We have quantified the concept of evolvability in the 
context of regulatory networks by introducing an in- 
terpretable measure, and by probing the space of the 
networks' most informative functions. Our measure is 
an anti-correlation between the amount of functional 
change experienced by a network and the parametric 
change required to effect it, such that more evolvable net- 
works explore more diverse functions with smaller varia- 
tion in their biochemical parameters. We have fully de- 
fined functional and parametric distances (as well as the 
characterization of 'function' itself) in the context of a 
stochastic description of the experimental setup of Guet 
et al. |3], and we have chosen a correlation measure that 
is invariant to monotonic transformations in either defi- 
nition. 

We have found that all networks studied share the 
property that functional change is largely independent of 
parametric change, meaning that they arc highly evolv- 
able by our measure. This property holds for several dif- 
ferent definitions of function distance. This means that 
high-information functions are not organized in param- 
eter space in such a way that similar functions are near 
each other; instead nearby solutions are approximately as 
likely to be similar in function as they are to be different 
in function. 

Furthermore, we have found that all networks studied 
can transition among their maximally informative func- 
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FIG. 3: Changing function without losing information. A 
Left: Mutual information I along straight-line paths in pa- 
rameter space between pairs of 10 randomly chosen optimally 
informative model solutions for a particular network (inset). 
For each path, the starting and ending solution's locations in 
parameter space are denoted f?i and #2 respectively on the 
horizontal axis. The minimum mutual information Iq along 
each path is marked with a triangle. A specific function is 
performed at each of the 10 solutions (as characterized in 
Methods); 7 of the 10 functions are unique. Right: Distribu- 
tion of 7o values built from paths between 37 randomly chosen 
solutions for the inset network, of which the 10 solutions used 
for the left plot are a subset. B: Means (circles) and stan- 
dard deviations (error bars) of Iq distributions like that in A 
(right), for all networks studied; 37 randomly chosen solutions 
were used to build each network's distribution. Networks are 
shown on the horizontal axis, in the same order as in Figure 
[2j i.e. ranked by evolvability score E. 

tions without significant loss of information in the pro- 
cess. Along straight-line paths in parameter space be- 
tween functions (with mutual information values in the 
range ~2 — 2.8 bits), mutual information remains above 
~2 bits on average and very rarely drops below 1 bit. 
Moreover, these values represents a lower bound, since 
transition paths need not be straight. This suggests that 
the networks can evolve without losing functionality in 



the process, which resonates with the idea from evolu- 
tionary biology that evolution happens not by crossing 
high fitness barriers (low-information solutions in our 
case) , but by finding neutral paths [38 . 

Ultimately we have uncovered two important proper- 
ties of the regulatory networks described by our model: 
(a) high-information solutions do not cluster by function, 
and (b) transitions among solutions are possible without 
significant loss of fidelity. Both of these properties under- 
score the high evolvability of the networks studied. It is 
possible that these properties are general characteristics 
of a class of systems extending beyond small transcrip- 
tional regulatory networks, particularly systems governed 
by a large number of tunable parameters. However, we 
argue that these properties are especially relevant here, 
as they are critical to a quantitative description of the 
capacity of biological networks to evolve. 
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In networks in which the overall sign of the feedback cy- 
cle is negative, there can exist parameter values that sup- 
port multiple stable fixed points. This would correspond 
to one or more of the output distributions being multi- 
modal. Since we effectively minimize overlap of output 
states by optimizing information transmission, such so- 
lutions are rare (13% occurrence in all negative-feedback 
networks). When they do occur, we equally weight each 
fixed point in constructing the multimodal Gaussian out- 
put, and continue to define r by the ranks of the means 
of the output distributions. 
[41] If two solutions from the same local information max- 
imum are treated as distinct, they will have the same 
function but (slightly) different parameters; this will ar- 
tificially lower E. To correct for this effect, we merge (at 
their mean parameter location) nearest neighbors whose 
functions are the same until all nearest neighbors have 
different functions. This procedure reduced networks' so- 
lution sets by at most ~10%. 
[42] Many sources (including MATLAB's built-in corr) use 
an adjustment to the calculation of r in the case of tied 
data (see e.g. |39|). In keeping with the interpretation 
of our statistic as a probability, we do not introduce an 



adjustment; we simply count each tied pair as neither 
concordant nor discordant (i.e. if, for example, in com- 
puting the fraction of concordant pairs, we assigned each 
concordant pair a 1 and each discordant pair a 0, a tied 
pair would count as 0.5). 
[43] Not all 8! rankings of the output distributions are allowed 
functions for a given network. As shown in previous work 
|13| . the topology of the network constrains the set of pos- 
sible steady-state functions. Specifically, since each gene 
is regulated by one other gene, allowed functions are "di- 
rect" functions: those in which the output distribution 
responds to a change in inducer concentration according 
to the direct path from inducer to reporter (i.e., ignor- 
ing feedback pathways). For example, for the network 

in Fig. [ljA, in going from state [ ] (i=l) to [ — I — ] 

(i=3), sb increases; the direct path from sb to G consists 
of a repression-repression-repression chain, which is net 
repressive, so the output distribution must decrease (as 
it does in both panels of Fig. |TJ3) . With 3 inducers, there 
are 48 direct functions for each network; this is the set 
from which functions are randomly drawn in the second 
implementation of the null hypothesis. 



